Hold this via WeakRef in IntersectionObserver callback#5821
Hold this via WeakRef in IntersectionObserver callback#5821srid wants to merge 1 commit intoxtermjs:masterfrom
Conversation
Downstream kolu (juspay/kolu) consumes this fork via pnpm.overrides github URL. The main fix/kolu-xterm-fixes branch is source-only; this branch adds the pre-built .mjs bundles so no build step runs at pnpm install time. Built output contains both fixes: - xtermjs#5817: register CursorBlinkStateManager + _pausedResizeTask - xtermjs#5821: WeakRef this in IntersectionObserver callback (xtermjs#5820)
…fork Bumps the pnpm override pointer from `fix/dispose-leaks-built` to `fix/kolu-xterm-fixes-built`, which stacks a second fix on top of the existing dispose-leaks patches: - xtermjs/xterm.js#5817 (`fix/dispose-leaks`): register CursorBlinkStateManager + _pausedResizeTask disposables. Already in production. - xtermjs/xterm.js#5821 (new, `fix/intersection-observer-weakref`): wrap `this` in a `WeakRef` inside `RenderService`'s IntersectionObserver callback. `observer.disconnect()` on our side wasn't releasing the callback in practice — heap snapshot showed 175,594 retained Uint32Arrays (~220 MB / 30 mode-toggles with 7 terminals) traced through the global IntersectionObserver registry → callback closure → RenderService → BufferService → BufferLines. WeakRef breaks that chain regardless of why the native registry held the callback. Local measurement on kolu@zest (fresh tab, 30 canvas↔focus toggles, 7 terminals restored): Before fix → After fix Task Manager Memory Footprint +367 MB → -3 MB BufferLine (Z1) instances Δ +175,594 → ~0 Upstream tracking: xtermjs/xterm.js#5820 (issue), xtermjs/xterm.js#5821 (PR). Fork consumption will collapse to a plain version bump once upstream merges + releases.
…fork Bumps the pnpm override pointer from `fix/dispose-leaks-built` to `fix/kolu-xterm-fixes-built`, which stacks a second fix on top of the existing dispose-leaks patches: - xtermjs/xterm.js#5817 (`fix/dispose-leaks`): register CursorBlinkStateManager + _pausedResizeTask disposables. Already in production. - xtermjs/xterm.js#5821 (new, `fix/intersection-observer-weakref`): wrap `this` in a `WeakRef` inside `RenderService`'s IntersectionObserver callback. `observer.disconnect()` on our side wasn't releasing the callback in practice — heap snapshot showed 175,594 retained Uint32Arrays (~220 MB / 30 mode-toggles with 7 terminals) traced through the global IntersectionObserver registry → callback closure → RenderService → BufferService → BufferLines. WeakRef breaks that chain regardless of why the native registry held the callback. Local measurement on kolu@zest (fresh tab, 30 canvas↔focus toggles, 7 terminals restored): Before fix → After fix Task Manager Memory Footprint +367 MB → -3 MB BufferLine (Z1) instances Δ +175,594 → ~0 Upstream tracking: xtermjs/xterm.js#5820 (issue), xtermjs/xterm.js#5821 (PR). Fork consumption will collapse to a plain version bump once upstream merges + releases.
…gles) (#617) ## Summary Fixes the canvas/focus-toggle memory leak that was pushing production pureintent to 1.2 GB Memory Footprint. The leak is upstream in `xterm.js`; this PR bumps the `pnpm.overrides` pointer to a fork branch that stacks one additional fix on top of `fix/dispose-leaks-built`. **Upstream tracking:** xtermjs/xterm.js#5820 (issue), xtermjs/xterm.js#5821 (PR). ## The leak `RenderService._registerIntersectionObserver` creates an `IntersectionObserver` whose callback closes over `this` directly. Although `_observerDisposable` calls `observer.disconnect()` on dispose, the retained callback chain keeps `this` (RenderService) alive in practice — along with `_coreService → _bufferService → buffers → BufferLine → Uint32Array` cell data. Heap-snapshot diff across 30 mount/unmount cycles of 7 `Terminal` instances: | Class | Δ count | Δ bytes | |---|---|---| | `native:system / JSArrayBufferData` | +175,594 | **+220 MB** | | `object:Uint32Array` | +175,594 | +10 MB | | `object:ArrayBuffer` | +175,594 | +9 MB | | `object:BufferLine` | +175,594 | +5 MB | Every retained `Uint32Array` traced through the same retainer signature: global `IntersectionObserver` registry → callback closure → `RenderService` → service graph → BufferLines. 175,594 = 30 toggles × 7 terminals × ~830 scrollback lines — basically the full buffer of every disposed `Terminal` pinned past `terminal.dispose()`. The fix wraps `this` in a `WeakRef`. Functional semantics preserved: while RenderService is alive, `deref()` returns it and the handler runs unchanged. Once no strong refs remain, the callback is a no-op and the BufferService graph can GC. ## Ground-truth measurement Local `kolu@zest` (fresh tab, 30 canvas↔focus toggles, 7 terminals restored from session): | Metric | Before | After | |---|---|---| | Memory Footprint Δ / 30 toggles | **+367 MB** | **−3 MB** | | JS live Δ | +66 MB | ~0 | | BufferLine instances retained | +175,594 | ~0 | Also verified the `fix/dispose-leaks` fix (upstream #5817) still ships — the stacked fork branch contains both. ## Alternatives considered - **Chase why `disconnect()` isn't releasing the callback.** Could be DevTools instrumentation, a Chrome extension patching `window.IntersectionObserver`, a native registry quirk — unclear, and likely environment-dependent. The WeakRef wrap is defensive and preserves semantics regardless of root cause. - **Null out `_coreService` / `_bufferService` in `RenderService.dispose()`.** Narrower scope but requires a dispose override. WeakRef is cleaner — no new dispose path, just a smaller capture surface. ## Test plan - [x] `just check` passes - [x] Local fresh tab + 30 mode-toggles: Memory Footprint flat (verified via Chrome Task Manager) - [x] Deploy to `pureintent`, retest footprint delta on production prod-like build - [ ] Upstream xtermjs/xterm.js#5821 lands → drop fork override for this specific fix in a follow-up 🤖 Diagnosis assisted by [Claude Code](https://claude.com/claude-code)
Blog post tracing the IntersectionObserver / BufferLine retention story — wrong turn in #614, byte-delta heap diff as the pivot, the one-line WeakRef patch in xtermjs/xterm.js#5821. Written in the voice of the debugging-story canon: cold open on the symptom, the wrong turns compressed rather than storied, anticlimactic fix. Targets readers from backend / systems backgrounds who may not be fluent in web-frontend memory tooling — links to MDN and Chrome DevTools docs for the web-specific concepts. Credit: I drove; Claude Code (https://claude.com/claude-code) did the agent-side work. Under docs/perf-investigations/ alongside the technical chapters in memory-learnings.md.
Draft blog post for review. Push revisions as additional commits for iteration. Preview on GitHub by clicking the file in the diff, or view raw for a cleaner markdown render: - [docs/perf-investigations/the-leak-that-wasnt-in-any-context.md](https://github.com/juspay/kolu/blob/docs/leak-blog-post/docs/perf-investigations/the-leak-that-wasnt-in-any-context.md) Shape of the post: - Intro — what Kolu is (cockpit for coding agents), canvas mode (linked to [@sridca tweet](https://x.com/sridca/status/2044953014100726221)), symptom - Wrong turn — compressed, not storied: six commits chasing the `Context` count proxy on #614, 89% reduction, zero Task Manager movement - What I was actually measuring — brief explainer on Task Manager vs `performance.memory` for readers who aren't web-perf fluent - The one-line fix — byte-delta diff, retainer chain, WeakRef diff - The xterm.js side — highlights both contributions (xtermjs/xterm.js#5817 + xtermjs/xterm.js#5821) as "laughably small, hours to find" - What I'd tell past-me — three takeaways aimed at backend/systems readers - Links to master-branch docs/skill/scripts for anyone who wants the full investigation record Attribution: I drove; [Claude Code](https://claude.com/claude-code) did the agent-side work. Not for merge as-is — draft is to iterate on wording before deciding where it's actually published.
| const weakSelf = new WeakRef(this); | ||
| const observer = new w.IntersectionObserver( | ||
| e => weakSelf.deref()?._handleIntersectionChange(e[e.length - 1]), | ||
| { threshold: 0 } | ||
| ); | ||
| observer.observe(screenElement); | ||
| this._observerDisposable.value = toDisposable(() => observer.disconnect()); |
There was a problem hiding this comment.
I'm surprised this isn't handled by observer.disconnect() dropping the refs to the callback. Is there a way to solve this without the introduction of WeakRef?
There was a problem hiding this comment.
To me this smells like a browser bug and might be better addressed in chrome directly. Should also be tested, whether it is reproducible on other browser engines.
There was a problem hiding this comment.
Checked the w3c spec - disconnect does indeed not nullify the callback, in fact the observer object holds onto the callback to be reusable with new .observe(some_element) calls. So this is indeed specced like this.
Would it help to just throw away the observer object itself for the GC to clean the callback up?
There was a problem hiding this comment.
Yepp (and to make sure, the ref gets not bound elsewhere).
There was a problem hiding this comment.
That would be preferable to a WeakRef, every time I've thought I needed it there was something simpler like this.
There was a problem hiding this comment.
Confirmed this to help too.
So, we can close this PR and choose #5831 instead.
The IntersectionObserver callback in RenderService closes over `this` directly.
On dispose, `_observerDisposable` calls `observer.disconnect()` — which in
theory releases the callback. In practice, across 30 mount/unmount cycles of
7 terminals in a multi-tile app, the retainer chain
Window.IntersectionObserver (registry)
-> callback closure
-> RenderService (this)
-> _coreService -> _bufferService -> buffers -> lines -> BufferLine -> Uint32Array
kept 175,594 BufferLine Uint32Arrays alive (+220 MB of native ArrayBuffer
bytes) past dispose. Whether this is a Chrome DevTools wrapper, an extension
patching window.IntersectionObserver, or something subtler in the native
registry — either way, `disconnect()` on our side wasn't enough to free the
callback in that environment.
Wrap `this` in a WeakRef so the callback holds only a weak reference.
Functional behavior is preserved: while RenderService is alive, `deref()`
returns it and the handler runs. Once no strong refs remain, the callback
becomes a no-op and the BufferService graph is free to GC — which is what
the outer `_observerDisposable` was supposed to guarantee but doesn't in
practice.
Refs xtermjs#5820
1ea3189 to
0964ea4
Compare
|
@srid In juspay/kolu#652 you write, that the callback retention is not reproducible locally with chrome-devtools MCP (shows always benign behavior), but with normal chrome + devtools it is and that only the weakref counters it effectively. Can you please repeat the tests with devtools not opened, e.g. by adding the async snippet on toplevel and let it trigger the toggles by a timeout from normal load? then inspect the memory footprint again. Just want to make sure, that this is not a devtools instrumentation bug (devtools exhibits several of those, esp. during perf testing with screwed up timer events and overly big websocket load). |
@jerch This was only because Kolu had already moved to an unified infinite canvas model (no more two-mode switching). And I had to go back in history to old model, and reproduce it there. Sharing results shortly, in the thread above. |
Summary
Fix for #5820.
RenderService._registerIntersectionObservercreates anIntersectionObserverwhose callback closes overthisdirectly. Although_observerDisposablecallsobserver.disconnect()on dispose, the retained callback chain keepsthis(RenderService) alive in practice — along with_coreService → _bufferService → buffers → BufferLine → Uint32Arraycell data.Wrap
thisin aWeakRef. While RenderService is alive,deref()returns it and the handler runs unchanged. Once no strong refs remain, the callback becomes a no-op and the BufferService graph can GC.Evidence (from #5820)
Heap-snapshot diff across 30 mount/unmount cycles of 7
Terminalinstances in a multi-tile app:native:system / JSArrayBufferDataobject:Uint32Arrayobject:ArrayBufferobject:BufferLineEvery retained
Uint32Arraytraced to the same retainer chain — globalIntersectionObserverregistry → callback closure →RenderService→ service graph → BufferLines. After the WeakRef fix, production-equivalent local testing shows Memory Footprint flat across 30 mode-toggles (was +367 MB/30-toggles before).Why not just fix
disconnect()?Unclear why
observer.disconnect()isn't releasing the callback in the observed environment — could be DevTools instrumentation, a Chrome extension patchingwindow.IntersectionObserver, a native registry quirk, or something else. The WeakRef wrap is a defensive fix that breaks the retention chain regardless of the underlying cause and preserves functional semantics exactly.Test
No behavioral change to validate beyond existing renderer-pause-on-visibility tests. A regression would manifest as the pause handler not firing when visibility changes; while RenderService is strongly rooted (which it is for any live terminal),
deref()always returns the instance.Closes #5820